class: center, middle, inverse, title-slide .title[ # APEC8211: Recitation 2 ] .author[ ### Shunkei Kakimoto ] --- class: middle <style type="text/css"> .remark-slide-number { display: none; } .remark-slide-content.hljs-github h1 { margin-top: 5px; margin-bottom: 25px; } .remark-slide-content.hljs-github { padding-top: 10px; padding-left: 30px; padding-right: 30px; } .panel-tabs { <!-- color: #062A00; --> color: #841F27; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; padding-bottom: 0px; } .panel-tab { margin-top: 0px; margin-bottom: 0px; margin-left: 3px; margin-right: 3px; padding-top: 0px; padding-bottom: 0px; } .panelset .panel-tabs .panel-tab { min-height: 40px; } .remark-slide th { border-bottom: 1px solid #ddd; } .remark-slide thead { border-bottom: 0px; } .gt_footnote { padding: 2px; } .remark-slide table { border-collapse: collapse; } .remark-slide tbody { border-bottom: 2px solid #666; } .important { background-color: lightpink; border: 2px solid blue; font-weight: bold; } .remark-code { display: block; overflow-x: auto; padding: .5em; background: #ffe7e7; } .remark-code, .remark-inline-code { font-family: 'Source Code Pro', 'Lucida Console', Monaco, monospace;font-size: 50%; } .hljs-github .hljs { background: #f2f2fd; } .remark-inline-code { padding-top: 0px; padding-bottom: 0px; background-color: #e6e6e6; } .r.hljs.remark-code.remark-inline-code{ font-size: 0.9em } .left-full { width: 80%; float: left; } .left-code { width: 38%; height: 92%; float: left; } .right-plot { width: 60%; float: right; padding-left: 1%; } .left6 { width: 60%; height: 92%; float: left; } .left5 { width: 49%; <!-- height: 92%; --> float: left; } .right5 { width: 49%; float: right; padding-left: 1%; } .right4 { width: 39%; float: right; padding-left: 1%; } .left3 { width: 29%; height: 92%; float: left; } .right7 { width: 69%; float: right; padding-left: 1%; } .left4 { width: 38%; float: left; } .right6 { width: 60%; float: right; padding-left: 1%; } ul li{ margin: 7px; } 
ul, li{ margin-left: 15px; padding-left: 0px; } ol li{ margin: 7px; } ol, li{ margin-left: 15px; padding-left: 0px; } </style> <style type="text/css"> .content-box { box-sizing: border-box; background-color: #e2e2e2; } .content-box-blue, .content-box-gray, .content-box-grey, .content-box-army, .content-box-green, .content-box-purple, .content-box-red, .content-box-yellow { box-sizing: border-box; border-radius: 5px; margin: 0 0 10px; overflow: hidden; padding: 0px 5px 0px 5px; width: 100%; } .content-box-blue { background-color: #F0F8FF; } .content-box-gray { background-color: #e2e2e2; } .content-box-grey { background-color: #F5F5F5; } .content-box-army { background-color: #737a36; } .content-box-green { background-color: #d9edc2; } .content-box-purple { background-color: #e2e2f9; } .content-box-red { background-color: #ffcccc; } .content-box-yellow { background-color: #fef5c4; } .content-box-blue .remark-inline-code, .content-box-blue .remark-inline-code, .content-box-gray .remark-inline-code, .content-box-grey .remark-inline-code, .content-box-army .remark-inline-code, .content-box-green .remark-inline-code, .content-box-purple .remark-inline-code, .content-box-red .remark-inline-code, .content-box-yellow .remark-inline-code { background: none; } .full-width { display: flex; width: 100%; flex: 1 1 auto; } </style> <style type="text/css"> blockquote, .blockquote { display: block; margin-top: 0.1em; margin-bottom: 0.2em; margin-left: 5px; margin-right: 5px; border-left: solid 10px #0148A4; border-top: solid 2px #0148A4; border-bottom: solid 2px #0148A4; border-right: solid 2px #0148A4; box-shadow: 0 0 6px rgba(0,0,0,0.5); /* background-color: #e64626; */ color: #e64626; padding: 0.5em; -moz-border-radius: 5px; -webkit-border-radius: 5px; } .blockquote p { margin-top: 0px; margin-bottom: 5px; } .blockquote > h1:first-of-type { margin-top: 0px; margin-bottom: 5px; } .blockquote > h2:first-of-type { margin-top: 0px; margin-bottom: 5px; } .blockquote > h3:first-of-type 
{ margin-top: 0px; margin-bottom: 5px; } .blockquote > h4:first-of-type { margin-top: 0px; margin-bottom: 5px; } .text-shadow { text-shadow: 0 0 4px #424242; } </style> <style type="text/css"> /****************** * Slide scrolling * (non-functional) * not sure if it is a good idea anyway slides > slide { overflow: scroll; padding: 5px 40px; } .scrollable-slide .remark-slide { height: 400px; overflow: scroll !important; } ******************/ .scroll-box-8 { height:8em; overflow-y: scroll; } .scroll-box-10 { height:10em; overflow-y: scroll; } .scroll-box-12 { height:12em; overflow-y: scroll; } .scroll-box-14 { height:14em; overflow-y: scroll; } .scroll-box-16 { height:16em; overflow-y: scroll; } .scroll-box-18 { height:18em; overflow-y: scroll; } .scroll-box-20 { height:20em; overflow-y: scroll; } .scroll-box-24 { height:24em; overflow-y: scroll; } .scroll-box-30 { height:30em; overflow-y: scroll; } .scroll-output { height: 90%; overflow-y: scroll; } </style>

# Outline

Review some concepts related to random variables

<!-- # main -->
[1. CDF, PDF, PMF (Quick review)](#dist)

+ [Exercise problems](#ex1)

<!-- # To explain Jensen's inequality -->
[2. Mean and variance (Quick review)](#mean)

+ [Exercise problems](#ex2)

[3. Jensen's inequality (Quick review)](#jensen)

---
class: inverse, center, middle
name: dist

# CDF, PDF, and PMF

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=796px></html>

---

.content-box-red[**Distribution function**]

+ Cumulative distribution function (CDF)
  + **Definition**: <span style="color:red">The CDF of a random variable `\(X\)` is `\(F(x) = Pr[X \leq x]\)`</span>
  + **Verbally**: The CDF `\(F(x)\)` gives the probability that the random variable `\(X\)` takes a value less than or equal to `\(x\)`.

.left5[
<img src="data:image/png;base64,#recitation2_slides_files/figure-html/unnamed-chunk-5-1.png" width="100%" style="display: block; margin: auto;" />
]

.right5[
<img src="data:image/png;base64,#recitation2_slides_files/figure-html/unnamed-chunk-6-1.png" width="100%" style="display: block; margin: auto;" />
]

---
<!-- PDF and PMF (1) -->

.content-box-red[**Probability mass function (Discrete random variables)**]

+ **Definition**: `\(\color{red}{\pi(x) = Pr[X = x]}\)`
+ **Verbally**: The probability that `\(X\)` equals the value `\(x\)`

<br>

.content-box-red[**Probability density function (Continuous random variables)**]

+ **Definition**: `\(\color{red}{f(x) = \frac{d}{dx}F(x)} \quad ( = \displaystyle \lim_{h\to 0} \frac{F(x+h)-F(x)}{h})\)`
+ **Verbally**: The density is the rate at which the CDF increases at `\(x\)`. It is not itself a probability: for a small `\(h\)`, `\(f(x)h\)` approximates the probability that `\(X\)` falls in `\([x, x+h]\)` (see [wikipedia](https://en.wikipedia.org/wiki/Probability_density_function)).

---
<!-- PDF and PMF (2) -->

.content-box-red[**Probability mass function (Discrete random variables)**]

+ **Definition**: `\(\color{red}{\pi(x) = Pr[X = x]}\)`
+ **Verbally**: The probability that `\(X\)` equals the value `\(x\)`

<br>

.content-box-red[**Probability density function (Continuous random variables)**]

+ **Definition**: `\(\color{red}{f(x) = \frac{d}{dx}F(x)} \quad ( = \displaystyle \lim_{h\to 0} \frac{F(x+h)-F(x)}{h})\)`
+ **Verbally**: The density is the rate at which the CDF increases at `\(x\)`. It is not itself a probability: for a small `\(h\)`, `\(f(x)h\)` approximates the probability that `\(X\)` falls in `\([x, x+h]\)` (see [wikipedia](https://en.wikipedia.org/wiki/Probability_density_function)).

<br>

.content-box-red[**Theorem 2.3: Properties of a PDF**]

A function `\(f(x)\)` is a density function **if and only if**

`$$\begin{cases} f(x) \ge 0 \text{ for all } x \\ \int_{-\infty}^\infty f(x)\,dx = 1 \end{cases}$$`

<!-- if you are asked to show that a function f(x) is a valid density function, check whether f(x) satisfies these properties or not. -->

---
class: middle

.content-box-green[**Relationship between CDF and PDF**]

+ **From CDF to PDF**: `\(f(x) = \frac{d}{dx}F(x)\)` <br> (by definition of PDF)
+ **From PDF to CDF**: `\(F(x) = Pr(X \leq x) = \int_{-\infty}^x f(t) dt\)` <br> (as shown below)

<img src="data:image/png;base64,#recitation2_slides_files/figure-html/unnamed-chunk-18-1.gif" width="80%" style="display: block; margin: auto;" />

---
name: ex1

# Exercise 1

.content-box-green[**Final Exam: 2021: Problem 1**]

Define `\(\Phi(z)\)` as the CDF of a standard normal random variable `\(Z\)` and `\(\phi(z)\)` as its density function.

(a) Write `\(Pr(Z \leq b)\)` using `\(\Phi(\cdot)\)`.

(b) Write `\(Pr(Z \leq b)\)` as an integral.

(c) Write `\(Pr(a \leq Z \leq b)\)` using `\(\Phi(\cdot)\)`.

(d) Write `\(Pr(a \leq Z \leq b)\)` as an integral.

---

# Exercise 2

.content-box-green[**PSE Exercise 2.1**]

Let `\(X \sim U[0,1]\)`. Find the PDF of the random variable `\(Y=X^2\)`.

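---

# Numerical check: CDF and PDF

The two directions of the CDF/PDF relationship from this section can be checked numerically. The sketch below is only an illustration and makes its own assumptions: it uses an exponential distribution with rate 2 and the evaluation point `x0 = 0.7` (neither comes from the exercises), together with base R's `dexp()`, `pexp()`, and `integrate()`.

```r
# Assumption for illustration only: X ~ Exponential(rate = 2), evaluated at x0 = 0.7.
rate <- 2
x0 <- 0.7

# From PDF to CDF: integrate the density from the lower end of the support (0) up to x0.
cdf_by_integration <- integrate(dexp, lower = 0, upper = x0, rate = rate)$value
cdf_builtin <- pexp(x0, rate = rate)

# From CDF to PDF: approximate dF/dx with a central difference and a small step h.
h <- 1e-5
pdf_by_difference <- (pexp(x0 + h, rate = rate) - pexp(x0 - h, rate = rate)) / (2 * h)
pdf_builtin <- dexp(x0, rate = rate)

# Theorem 2.3: a density is nonnegative and integrates to 1 over its support.
total_mass <- integrate(dexp, lower = 0, upper = Inf, rate = rate)$value

# Each pair should agree up to numerical error.
print(c(cdf_by_integration = cdf_by_integration, cdf_builtin = cdf_builtin))
print(c(pdf_by_difference = pdf_by_difference, pdf_builtin = pdf_builtin))
print(c(total_mass = total_mass))
```

A central difference is used here instead of the one-sided quotient in the definition only because it is more accurate for the same `h`; both converge to `\(f(x)\)` as `\(h \to 0\)`.
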
---
class: inverse, center, middle
name: mean

# Mean, Variance, and Jensen's inequality

<html><div style='float:left'></div><hr color='#EB811B' size=1px width=796px></html>

---
<!-- mean and variance 1 -->

**Definition** 2.18, 2.19:

+ The mean of `\(X\)` is <span style='color:red'>`\(E[X]\)`</span>
+ The variance of `\(X\)` is <span style='color:red'>`\(Var[X]=E[(X-E[X])^2]\)`</span>

---
<!-- mean and variance 2 -->

**Definition** 2.18, 2.19:

+ The mean of `\(X\)` is <span style='color:red'>`\(E[X]\)`</span>
+ The variance of `\(X\)` is <span style='color:red'>`\(Var[X]=E[(X-E[X])^2]\)`</span>

.content-box-green[Visualization]

.panelset[
.panel[.panel-name[Code 1: Mean]

```r
# /*===========================================*/
#'=  Mean of X =
# /*===========================================*/
library(data.table) # data.table(), rbindlist()
library(ggplot2)    # plotting

# /*===== Data generation =====*/
# --- Set the range of X --- #
x_left <- 2
x_right <- 10
x_center <- (x_left + x_right)/2

# --- Creating data: density of N(x_center, 1) on a grid --- #
x <- seq(x_left, x_right, length.out = 1000)
y <- dnorm(x, mean = x_center, sd = 1)
plot_data <- data.table(x = x, y = y)

# /*===== Visualization =====*/
plot_mean_X <- ggplot(data = plot_data) +
  geom_line(aes(x = x, y = y), color = "black") +
  labs(y = "Density", title = "Density of X ~ N(6,1)") +
  # --- vertical line at the mean --- #
  geom_vline(xintercept = x_center, color = "red") +
  annotate("text", x = 7, y = 0.01, label = "Mean E[X]=6", size = 3, color = "red") +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5))
```
]

.panel[.panel-name[Code 2: Variance]

```r
# /*===========================================*/
#'=  Variance =
# /*===========================================*/
# /*===== Data generation =====*/
# --- a list of the values of Var[X] --- #
ls_var <- c(1, 4, 9)

# --- Create an object to save results --- #
ls_res <- vector(mode = "list", length = length(ls_var))

# --- Range of X --- #
x_left <- 2
x_right <- 10
x_center <- (x_left + x_right)/2

# --- Generate data by the value in ls_var --- #
for (i in seq_along(ls_var)) {
  var <- ls_var[i]
  x <- seq(x_left, x_right, length.out = 1000)
  # dnorm() takes the standard deviation, so pass sqrt(variance)
  y <- dnorm(x, mean = x_center, sd = sqrt(var))
  ls_res[[i]] <- data.table(var = var, x = x, y = y)
}

res_total <- rbindlist(ls_res)

# /*===== Visualization =====*/
plot_var_X <- ggplot(data = res_total) +
  # --- one curve per variance value --- #
  geom_line(aes(x = x, y = y, color = factor(var))) +
  labs(y = "Density", title = expression("Density of X ~ N(6," ~ sigma^2 ~ ") with various " ~ sigma^2)) +
  guides(color = guide_legend(title = expression(sigma^2 ~ "="))) +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5))
```
]

.panel[.panel-name[Graphs]

.left5[
<img src="data:image/png;base64,#recitation2_slides_files/figure-html/unnamed-chunk-10-1.png" width="100%" style="display: block; margin: auto;" />
]

.right5[
<img src="data:image/png;base64,#recitation2_slides_files/figure-html/unnamed-chunk-11-1.png" width="100%" style="display: block; margin: auto;" />
]
]
]

---

## E[ ] and Var[ ] as operators

.content-box-red[**Expectation: E[ ]**]

<span style="color:red"> `\(E[ \, ]\)` is a linear operator (linearity of expectation)</span>

For any constants `\(a\)` and `\(b\)`,

`$$E[a+bX] = a + bE[X]$$`

---

## E[ ] and Var[ ] as operators

.content-box-red[**Expectation: E[ ]**]

<span style="color:red"> `\(E[ \, ]\)` is a linear operator (linearity of expectation)</span>

For any constants `\(a\)` and `\(b\)`,

`$$E[a+bX] = a + bE[X]$$`

.content-box-red[**Variance: Var[ ]**]

`$$\begin{align*} \text{(1) } &Var[X] = E[X^2] - (E[X])^2 \\ \text{(2) } &Var[a+bX] = b^2Var[X] \end{align*}$$`

.content-box-green[**Question**]

Prove (1) and (2) using linearity of expectation.

---
name: jensen

# Jensen's inequality

(This is not a proof.)

In the previous slide, we saw `\(Var[X]=E[(X-E[X])^2]=E[X^2] - (E[X])^2\)`.

Because `\(Var[X]\)` is the expectation of a squared, hence nonnegative, quantity, `\(Var[X] \ge 0\)` (by the way, `\(Var[X]=0\)` if and only if `\(X\)` is degenerate). So,

`$$E[X^2] - (E[X])^2 \ge 0$$`

<p style="text-align: center;">or</p>

`$$(E[X])^2 \leq E[X^2]$$`

Define `\(g(x)=x^2\)`.
Then the inequality can be written as

`$$g(E[X]) \leq E[g(X)]$$`

Generally,

`$$\begin{align*} g(E[X]) \leq E[g(X)] \quad &\text{if } g(x) \text{ is a convex function} \\ E[g(X)] \leq g(E[X]) \quad &\text{if } g(x) \text{ is a concave function} \end{align*}$$`

---

.content-box-green[**Visualization**]

.panelset[
.panel[.panel-name[Example 1: g(x) is convex]

.left5[
Suppose that `\(g(x)=x^2\)`.

```r
library(ggplot2)

set.seed(356)
x <- runif(1000, 0, 10)

# /*===== Convex case: g(X)=X^2 =====*/
# Sample means stand in for the expectations E[X] and E[g(X)]
y <- x^2

figure_ex1 <- ggplot() +
  geom_point(aes(x = x, y = y)) +
  # --- E[X] --- #
  geom_vline(xintercept = mean(x), color = "red", linetype = "dashed") +
  annotate("text", x = mean(x)+1, y = 0.01, label = paste0("E[X]=", round(mean(x), 1)), size = 3, color = "red") +
  # --- E[g(X)] --- #
  geom_hline(yintercept = mean(y), color = "blue", linetype = "dashed") +
  annotate("text", x = 1, y = mean(y)+5, label = paste0("E[g(X)]=", round(mean(y), 1)), size = 3, color = "blue") +
  # --- g(E[X]) --- #
  geom_hline(yintercept = mean(x)^2, color = "darkgreen", linetype = "dashed") +
  annotate("text", x = 1, y = mean(x)^2-5, label = paste0("g(E[X])=", round(mean(x)^2, 1)), size = 3, color = "darkgreen") +
  theme_bw()
```
]

.right5[
<img src="data:image/png;base64,#recitation2_slides_files/figure-html/unnamed-chunk-13-1.png" width="100%" style="display: block; margin: auto;" />

`$$\color{darkgreen}{g(E[X])} \leq \color{blue}{E[g(X)]}$$`
]
]

.panel[.panel-name[Example 2: g(x) is concave]

.left5[
Suppose that `\(g(x)=\sqrt{x}\)`.

```r
# /*===== Concave case: g(X)=X^(1/2) =====*/
# Reuses x (and ggplot2) from Example 1
y <- x^(1/2)

figure_ex2 <- ggplot() +
  geom_point(aes(x = x, y = y)) +
  # --- E[X] --- #
  geom_vline(xintercept = mean(x), color = "red", linetype = "dashed") +
  annotate("text", x = mean(x)+0.8, y = 0.01, label = paste0("E[X]=", round(mean(x), 1)), size = 3, color = "red") +
  # --- E[g(X)] --- #
  geom_hline(yintercept = mean(y), color = "blue", linetype = "dashed") +
  annotate("text", x = 1, y = mean(y)-0.2, label = paste0("E[g(X)]=", round(mean(y), 2)), size = 3, color = "blue") +
  # --- g(E[X]) --- #
  geom_hline(yintercept = mean(x)^(1/2), color = "darkgreen", linetype = "dashed") +
  annotate("text", x = 1, y = mean(x)^(1/2)+0.2, label = paste0("g(E[X])=", round(mean(x)^(1/2), 2)), size = 3, color = "darkgreen") +
  theme_bw()
```
]

.right5[
<img src="data:image/png;base64,#recitation2_slides_files/figure-html/unnamed-chunk-15-1.png" width="100%" style="display: block; margin: auto;" />

`$$\color{blue}{E[g(X)]} \leq \color{darkgreen}{g(E[X])}$$`
]
]
]

---
<!-- https://stackoverflow.com/questions/53481699/customize-font-size-for-all-the-slides-in-xaringan -->
<!-- class: my-one-page-font -->

# Solution to Exercise 2

We want to derive the PDF of `\(Y\)`, where `\(Y=X^2\)`. Since `\(0 \leq X \leq 1\)`, we have `\(0 \leq Y \leq 1\)`.

Let `\(G(y)\)` be the CDF of `\(Y\)` and `\(g(y)\)` be the PDF of `\(Y\)`. By the definition of the PDF, `\(g(y)=\frac{d}{dy}G(y)\)`, so once we have `\(G(y)\)`, we can derive `\(g(y)\)`.

So, let's consider the CDF of `\(Y\)`. By the definition of the CDF,

`$$G(y) = Pr(Y \leq y).$$`

Since `\(Y=X^2\)`, for `\(0 \leq y \leq 1\)`,

`$$\begin{align*} G(y) &= Pr(X^2 \leq y)\\ &= Pr(0 \leq X \leq \sqrt{y}) \quad (\because X \ge 0).\\ \end{align*}$$`

We know that `\(X \sim U[0,1]\)`, so the density of `\(X\)` is 1 on `\([0,1]\)`. So,

`$$G(y) = \int_{0}^{\sqrt{y}} 1 dx = \sqrt{y}.$$`

Thus,

`$$\begin{align*} g(y) = \frac{d}{dy}G(y) = \frac{1}{2\sqrt{y}} \quad (0 < y \leq 1) \end{align*}$$`

and `\(g(y)=0\)` outside this range. The density is unbounded as `\(y \to 0\)`, so the formula is stated for `\(y > 0\)`.

---
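# Checking the solution by simulation

The derivation above can be sanity-checked by simulation. This is a minimal sketch, not part of the original solution: the seed, the sample size `n`, and the grid `y_grid` are arbitrary choices made here for illustration. It compares the empirical CDF of simulated `\(Y=X^2\)` with the derived `\(G(y)=\sqrt{y}\)`, and the sample mean of `\(Y\)` with `\(\int_0^1 y\,g(y)\,dy\)` computed from the derived density.

```r
# Assumptions for illustration only: seed, sample size, and evaluation grid.
set.seed(8211)
n <- 100000

# Simulate X ~ U[0,1] and transform to Y = X^2.
x <- runif(n, min = 0, max = 1)
y <- x^2

# Empirical CDF of Y at a few points versus the derived G(y) = sqrt(y).
y_grid <- c(0.1, 0.25, 0.5, 0.9)
empirical_cdf <- sapply(y_grid, function(v) mean(y <= v))
derived_cdf <- sqrt(y_grid)
max_cdf_gap <- max(abs(empirical_cdf - derived_cdf))

# E[Y] implied by the derived density: y * g(y) = y / (2 * sqrt(y)) = sqrt(y) / 2.
mean_from_density <- integrate(function(v) sqrt(v) / 2, lower = 0, upper = 1)$value
mean_from_sample <- mean(y)

print(data.frame(y = y_grid, empirical = empirical_cdf, derived = derived_cdf))
print(c(mean_from_density = mean_from_density, mean_from_sample = mean_from_sample))
```

If the derived density were wrong, the two CDF columns or the two means would differ by much more than simulation noise.

---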